BACKGROUND: The goal of this systematic review and meta-analysis was to estimate the rate of compliance with assisted reproductive 
technologies (ART) and examine its relationship with treatment success rates. 

METHODS: Six databases were systematically searched from 1978 to December 2011. Studies were included if they reported data on 
patient progression through three consecutive standard ART cycles. Compliance was estimated for the first three ART cycles (typical 
ART Regimen Compliance, TARC) and after the first and the second failed cycles (CAF1, CAF2). Treatment success rates for all patients 
who started ART and for those who fully complied with the three ART cycles were estimated. 

RESULTS: Ten studies with data for 14 810 patients were included. TARC was 78.2% [95% confidence interval (CI) 68.8-85.3%], CAF1 
was 81.8% (73.3-88.1%) and CAF2 was 75.3% (68.2-81.2%). The overall success rate was 42.7% (32.6-53.6%) for all patients starting ART 
and 57.9% (49.4-65.9%) for those who complied with three ART cycles. Compliance rates did not vary according to study quality, but TARC 
was higher for studies that reported data on doctor-censored patients versus those that did not (84.2%, 95% CI 75.5-90.2 versus 70.6%, 95% 
CI 58.3-80.5, P = 0.043). Analysis of funnel plots and the Egger test indicated publication bias for CAF1. 

CONCLUSIONS: Findings from this meta-analysis should reassure clinics and patients that most patients are able to comply with three 
cycles of ART. Compliers could increase their chances of success by as much as 15%. A more detailed assessment of compliance requires 
monitoring long-term treatment trajectories through the creation of national registries. 

Key words: assisted reproductive technologies / compliance rates / discontinuation / success rates 



Introduction 

Most couples have life plans that include having children but 9-15% 
will have problems conceiving spontaneously (Boivin et al., 2007). In- 
fertility is a significant impairment of function, which the first World 
Disability Survey ranks as 5th in the list of moderate to severe disabil- 
ities within the global population under the age of 60 (World Health 
Organization and The World Bank, 2011). Fortunately, the chances 
of achieving parenthood are high for couples undergoing fertility treat- 
ment. The world live birth rate with assisted reproductive technologies 
(ART, e.g. IVF) is 22% per single initiated cycle of treatment 
(de Mouzon et al., 2009) but can be 49% (Stern et al., 2010) or higher 
(Witsenburg et al., 2005; Verhagen et al., 2008) if people undergo 
the optimal number of cycles, typically three [National Institute for 
Clinical Excellence (NICE), 2004, p. 5]. However, many couples do 
not undergo multiple cycles of ART, even when there is a favourable 
prognosis and ability to cover the costs of treatment (Domar, 2004; 
Brandes et al., 2009). Indeed, discontinuation rates as high as 65% 
mainly due to psychological demands of treatment (Smeenk et al., 
2004; Brandes et al., 2009) have been reported (Rajkhowa et al., 
2006). Practice guidelines and national regulations emphasize the im- 
portance of discussing treatment success rates but not the rates of 
discontinuation [National Institute for Clinical Excellence (NICE), 
2004; European Society of Human Reproduction and Embryology 
(ESHRE), 2008; The Practice Committee of the Society for Assisted 
Reproductive Technology and the Practice Committee of the Ameri- 
can Society for Reproductive Medicine, 2008]. Recently, the UK Na- 
tional Institute for Clinical Excellence (NICE) recommended using 
compliance as a way of auditing treatment delivery at clinics [National 
Institute for Clinical Excellence (NICE), 2004, p.42], but to our knowl- 
edge, this has not been done. The World Health Organization 
(WHO) defines treatment compliance (or adherence) as '. . . the 
extent to which a person's behaviour follows medical advice or corre- 
sponds with agreed recommendations from a health care provider. . .' 
(WHO, 2003, p. 3). In medical practice, in general, compliance means 
'the degree of constancy and accuracy with which a patient follows 
a prescribed regimen' (http://medical-dictionary.thefreedictionary 
.com/compliance). Therefore, in ART, compliance would refer to 
the uptake of the ART cycles recommended by the doctor until preg- 
nancy is achieved or until there is a recommendation to end treatment 
(as well as compliance with medication, which is not addressed in the 
present review.) Although the terminology is compatible with the con- 
cepts of shared and informed decision-making on the part of the 
patient, there has been a reluctance to conceptualize discontinuation 
in ART as a compliance issue or to influence patient decision-making 
about pursuing treatment. Reference to compliance is made implicitly, 
when clinicians mention cumulative pregnancy rates or offer financial 
packages that take into account better success rates with multiple 
cycles (Garrido et al., 2011); however, few patients recall having the 
opportunity to discuss the advantages (24%) or disadvantages (18%) 
of ending/continuing treatment (Peddie et al., 2004). The lack of em- 
phasis on compliance in fertility treatment may be due to several 
factors. Unlike other disease contexts, people can opt out of fertility 
treatment without threatening their physical health and opting out 
can at times have beneficial consequences, for example on mental 
health (Peddie et al., 2005). Active intervention to encourage compli- 
ance could also be avoided because of popular conceptions of fertility 
doctors taking advantage of desperate infertile couples (Thompson, 
2005). However, even if doctors want to discuss compliance with 
their patients, they lack precise information as its prevalence has not 
yet been systematically estimated from the available literature. What- 
ever the cause, providing explicit information about compliance at the 
start of treatment (e.g. compliance rate, consequence of ending/con- 
tinuing treatment on success rate) is essential for informed consent; 
otherwise, patients begin treatment optimistic about success 
without fully realizing that the demands of treatment (e.g. physical, 
emotional and practical) may be such that they are unable to 
pursue the optimal number of cycles even when their prognosis is 
favourable and costs of treatment are covered (Domar, 2004; 
McDowell and Murray, 2011). 

There is high variability in the discontinuation rate reported in 
primary research, ranging from 15% (Brandes et al., 2009) to 65% 
(Rajkhowa et al., 2006), which makes it difficult to be confident 
about compliance. Variability may, in large part, be explained by the 
lack of consensus on the definition and monitoring of compliance; 
for example, in many studies the non-complier group includes poor 
prognosis patients who discontinued treatment because they were 
advised to stop treatment (De Vries et al., 1999), some studies 
monitor patients for too short a follow-up period to draw accurate 
conclusions on compliance (Land et al., 1997) and most studies do 
not control for patients who continue treatment at different clinics 
(Stolwijk et al., 1996; Verhagen et al., 2008) or at a later time in 
their lives (Pearson et al., 2009). Other issues that contribute to vari- 
ability in the compliance rate reported are treatment reimbursement 
policy, the type of population under study (e.g. previous experience 
with ART and parity), the type of ART treatment investigated and 
other methodological aspects (e.g. design, assessment of treatment 
initiation and success). Another important issue is that primary 
research has shown that ART success rates cannot be accurately 
estimated without considering discontinuation (Land et al., 1997), 
and therefore, the aforementioned issues would also impact on the 
reporting of success rates in ART. Further, the clinics' success rates 
may also influence compliance as past research has shown that people 
move to clinics perceived to have higher pregnancy rates to improve 
their chances of success (Marcus et al., 2005). A systematic review 
taking into account these issues would help achieve greater clarity 
on compliance in ART and its association with treatment success 
rates. 

The aims of the present systematic review and meta-analysis were 
3-fold. The first goal was to provide the first estimate of compliance 
among typical infertile patients undergoing standard ART treatment. 
In order to promote future consensus on how to define, monitor 
and report compliance, the second goal was to examine conceptual 
and methodological causes of variability in compliance. Finally, the 
third goal was to assess how compliance is associated with treatment 
success rates. 

Methods 

Systematic search 

The present work is part of a larger review that investigated reasons and 
predictors of discontinuation from fertility treatment (Gameiro et al., 
2012). The Sure Support Unit for Research Evidence (Cardiff University) 
searched six databases (Medline, Medline In Progress, EMBASE, BNI, 
PsycINFO and The Cochrane Library) from 1978 to December 2011 
(inclusive). A search strategy was created using terminology from the 
International Committee for Monitoring Assisted Reproductive Technol- 
ogy and the WHO-revised glossary of ART (Zegers-Hochschild et al., 
2009) for fertility treatment (e.g. ART, IVF) AND discontinuation (e.g. 
dropout, compliance and discontinuation), which, with small adaptations, 
was used in all databases (see Supplementary data, Table SI). MeSH 
terms were used in PubMed. No restriction was made on the type 
(journal, conference paper or dissertation) or language of publication. 
The reference sections of all identified articles were examined by S.G. 
and a research specialist (Debbie Moss, see funding) to identify other rele- 
vant manuscripts. 

Inclusion and exclusion criteria 

Studies were included if data were reported (or could be obtained from 
the corresponding author) on patient progression through a maximum 
of three consecutive standard ART (IVF or ICSI) cycles (i.e. number of 
patients starting, pregnant, discontinuing, continuing after failed treatment) 
or, if fewer, until pregnancy or until the clinician recommended the patient 
to end treatment (i.e. doctor censoring, where this information was pro- 
vided). Three cycles were used because it is the typically recommended 
and/or subsidized number of cycles that patients face for an optimal 
chance of pregnancy in an ART programme [National Institute for Clinical 
Excellence (NICE), 2004]. Only studies that focused on patients with no 
previous experience of ART were included. Studies that solely investigated 
single groups (e.g. third-party reproduction, recurrent miscarriage) or spe- 
cific ART treatment (e.g. modified natural IVF, transport IVF/ICSI) were 
also excluded to focus on the typical ART population. Duplicate or sec- 
ondary publications on the same sample were excluded to avoid multiple- 
publication bias. In these cases, we prioritized the publication that focused 
on discontinuation from treatment and, if this criterion did not apply, the 
publication that reported data for the largest sample. Excluded studies 
were classified according to reason for exclusion (see Fig. 1). 

Data extraction 

S.G. and a research specialist (D.B.) extracted data using a standardized 
protocol. Disagreement was resolved by discussion. Data were extracted 
or obtained from the corresponding author on characteristics of the 
study (e.g. country of origin, design), study population (e.g. average 
female age), clinical protocol (e.g. type of ART), health context (e.g. 
availability of subsidized/reimbursed treatment) and methodology (e.g. 
duration of follow-up period, inclusion or exclusion of cryopreserved 
IVF cycles in data reported). The data extracted to calculate the compli- 
ance rates were the numbers of patients who started treatment, who 
had successful or failed treatment, who were recommended to end 
treatment by their doctor (i.e. doctor censoring, where provided) and 
who discontinued or continued after a failed cycle. For those studies 
that reported on doctor censoring, data on its medical indication were 
also extracted. 

Quality assessment 

S.G., J.B. and C.M.V. assessed study quality according to the Newcastle- 
Ottawa Quality (NOQ) assessment scale (Wells et al., 2010) adapted for 
the present study. The NOQ is used to appraise quality in terms of popu- 
lation representativeness, measurement of outcome (compliance), within- 
population comparability (compliers versus discontinuers) and adequacy of 
follow-up (completion rates). The specific criteria used for quality assess- 
ment were already described elsewhere (Gameiro et al., 2012). Low-, 
moderate- and high-quality labels were assigned to scores of 0-2, 3-5 
and 6-7, respectively (see Supplementary data, Table SVI). 

Data analysis 

Studies differed in terms of the number of subsidized treatment cycles and 
the number of cycles followed up. To control for this variability, we based 
our compliance calculations on the treatment uptake for the first three 
ART cycles. Uptake of the first cycle was 100% because studies only followed 
up patients who did a first cycle. We assumed that after failure on first or 
second cycles, patients would be expected to undertake a further cycle 
unless they were recommended to end treatment (i.e. doctor censoring). 

Ideally, treatment success should be defined as achievement of a live 
birth. However, that is often not the case in primary research. Thus, treat- 
ment success (versus failure) was defined according to the success 
outcome reported in the primary study, which could be a β-hCG urine 
or blood test <21 days after embryo transfer, an ultrasonographic 
visualization of fetal heart activity or a live birth, as per standard definitions 
(Zegers-Hochschild et al., 2009). 

Three compliance rates were calculated per study: Typical ART 
Regimen Compliance (TARC) and compliance after the first and the 
second failed cycles (Compliance After-Failure, CAF1, CAF2). 

The TARC rate referred to patients who complied with all treatments 
recommended to them, that is, patients who continued with treatment for 
up to three cycles or until treatment success (as defined) or until advised 
to end treatment (i.e. doctor censoring). TARC was the sum of the 
number of patients who opted to undergo all three cycles when they 
failed on the first and the second cycles and of patients who stopped treat- 
ment either because it was successful or because they were censored by 
the doctor (where data on doctor censoring was reported), divided by the 
total number starting ART: 

TARC = [number of patients who underwent three cycles + number 
pregnant or with live birth + number doctor censored 
(if reported)]/number started 
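
As a minimal sketch of the TARC formula above, the calculation can be written as a short Python function. The function name, the argument names and the counts are hypothetical and are not taken from any included study. The sketch also assumes that the three groups in the numerator are mutually exclusive, that is, that patients counted as having undergone three cycles are those whose third cycle was not successful, so that no patient is counted twice.

```python
def tarc(started, three_cycles_no_success, successful, doctor_censored=0):
    """Typical ART Regimen Compliance: share of starters who complied.

    Assumption (not stated in the source): the three numerator groups
    are mutually exclusive, so each complier is counted exactly once.
    """
    if started <= 0:
        raise ValueError("started must be a positive count")
    compliers = three_cycles_no_success + successful + doctor_censored
    if compliers > started:
        raise ValueError("compliers cannot exceed the number who started")
    return compliers / started


# Hypothetical counts for illustration only
rate = tarc(started=200, three_cycles_no_success=60, successful=80,
            doctor_censored=10)
print(f"TARC = {rate:.1%}")
```

Studies that did not report doctor censoring would simply leave `doctor_censored` at its default of zero, which is one way to see why such studies tend to show lower TARC values.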

Records identified through database searching (n = 1713)
Records after duplicates removed (n = 1128)
Records screened (n = 1128)
Records excluded based on title and abstract (n = 913)
Full-text articles assessed for eligibility (n = 215)
Full-text articles excluded, with reasons (n = 202)
Full-text articles (n = 13)
Additional records identified through other sources (n = 1)
Full-text articles included in quantitative synthesis (meta-analysis) (n = 14); 10 studies

Reasons for exclusion

• No data on discontinuation (n = 107)
• Data on discontinuation but not during ART (n = 33)
• Data on discontinuation during ART but not on outcomes investigated (n = 32)
• Data on discontinuation during ART but insufficient or inconsistent (n = 10)
• Data on discontinuation during ART but decision due to other matters (n = 1)
• Data on discontinuation during ART but not on the typical ART population (n = 6)
• Qualitative paper (n = 2)
• Review, letter to editor, etc. (n = 11)

Figure 1 Decision flowchart for identified studies. 

The CAF rates provided an after-failure examination of compliance, that is, of 
patients who opted to undergo a further cycle after having had a failed cycle, 
and were therefore the number of patients undergoing a further 
cycle divided by the number of patients with a failed cycle. Compliance after-
failure was calculated for the first (CAF1) and the second (CAF2) failed 
cycles. Doctor-censored patients were excluded from the calculation of 
compliance after-failure rates (where such data were reported) because 
these patients would have been recommended to stop treatment, and 
therefore were not eligible for cycle uptake. The following formulas were used: 

CAF1 = (number of patients who underwent second cycle)/[number failed 
first cycle - number doctor censored after first cycle (if reported)] 

CAF2 = (number of patients who underwent third cycle)/[number failed 
second cycle - number doctor censored after second cycle (if reported)] 
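
The two compliance after-failure formulas share the same structure, which the following Python sketch makes explicit. The function and argument names and all counts are hypothetical; the sketch only restates the formulas above and is not the procedure implemented in the review's software.

```python
def caf(underwent_next_cycle, failed_cycle, doctor_censored=0):
    """Compliance after-failure: uptake of the next cycle among eligible patients.

    Eligible patients are those with a failed cycle who were not advised
    by their doctor to stop (doctor censoring, where reported).
    """
    eligible = failed_cycle - doctor_censored
    if eligible <= 0:
        raise ValueError("no eligible patients after removing censored cases")
    if underwent_next_cycle > eligible:
        raise ValueError("uptake cannot exceed the number of eligible patients")
    return underwent_next_cycle / eligible


# Hypothetical counts for illustration only
caf1 = caf(underwent_next_cycle=88, failed_cycle=120, doctor_censored=10)
caf2 = caf(underwent_next_cycle=45, failed_cycle=65, doctor_censored=5)
print(f"CAF1 = {caf1:.1%}, CAF2 = {caf2:.1%}")
```

Calling the same function once per failed cycle keeps the denominator rule (failed minus doctor censored) identical for CAF1 and CAF2.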

To examine whether the clinic's success rates per cycle were associated with 
compliance after those cycles, we computed treatment success rates per 
cycle (first and second cycle) for each study. As all but one study (Rufat 
et al., 1994) (excluded from this analysis) were single centre, this was equivalent 
to providing first and second cycle success rates for each clinic. The rates 
per cycle (first and second cycle) were the number of patients with a success- 
ful outcome in the first or the second cycle divided by the number of patients 
who underwent the first or the second cycle. 

To investigate how treatment success rates varied when compliance 
was taken into account, we calculated an overall success rate, which 
was the number of patients with a successful outcome in the first 
three ART cycles divided by the number of all patients who started the 
first ART cycle. We then calculated a separate typical regimen success 
rate that included only compliers (as defined in preceding TARC 
formula), that is, the number of patients with a successful outcome in 
the first three ART cycles divided by the number of compliers. Therefore, 
for each study, we had three types of success rates: clinic success rates per 
cycle [first and second cycles, excluding the study by Rufat et al. (1994)], 
overall success rate and success rate for compliers. 
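
The difference between the overall success rate and the success rate for compliers lies only in the denominator, as the following sketch illustrates. The function name and the counts are hypothetical. The check that successes cannot exceed compliers follows from the TARC definition, under which every patient with a successful outcome is counted as a complier.

```python
def success_rates(started, compliers, successes):
    """Return (overall success rate, success rate among compliers).

    Both rates share the numerator (successes within the first three
    cycles); only the denominator changes.
    """
    if not 0 < compliers <= started:
        raise ValueError("need 0 < compliers <= started")
    if not 0 <= successes <= compliers:
        raise ValueError("successful patients are compliers by definition")
    return successes / started, successes / compliers


# Hypothetical counts for illustration only
overall, among_compliers = success_rates(started=200, compliers=150,
                                         successes=80)
print(f"overall = {overall:.1%}, compliers = {among_compliers:.1%}")
```

Because the complier denominator is never larger than the number who started, the success rate for compliers can never be lower than the overall rate within the same study.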

In order to correct for variations in study sample size, pooled estimates 
across studies were obtained by means of random-effects models, after log 
transformation. We chose a random-effects model because single-group 
meta-analysis produces substantial heterogeneity. The I² index was used 
to describe the proportion of total variation in study estimates that was 
due to heterogeneity (Higgins et al., 2003). Subgroup and meta-regression 
analyses based on the random-effects model were performed to identify causes 
of heterogeneity in compliance rates among studies. Causes were 
defined a priori and referred to characteristics related to the studies' clinical 
aspects [clinic geographic location, clinic's success rates per cycle 
(assessed only for CAF), policy on the number of embryos transferred and whether 
treatment was subsidized/reimbursed], patients (parity) and methodology 
(study design, handling of doctor censoring, length of follow-up, definition 
of start-of-cycle and success, handling of cryopreserved IVF cycles, quality 
rating, year of publication). The χ² test was used to assess differences 
between the subgroups and the significance of the meta-regression coefficients 
was assessed with a Z-test. Publication bias was examined via 
visual inspection of the funnel plots (of the natural log of the rates 
against its standard error) and Egger's test (Egger et al., 1997). Trim 
and fill was used to adjust the pooled rates for the presence of publication 
bias (Duval and Tweedie, 2000). We used the Comprehensive Meta-Analysis 
software (Biostat Inc, 2011). 
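
To make the pooling step concrete, the sketch below shows one common way of pooling proportions under a random-effects model and of computing the I² index. It rests on two assumptions that the text does not confirm: that rates are transformed to the logit scale, and that the between-study variance is estimated with the DerSimonian-Laird method. It is an illustration of the general technique, not a reproduction of the Comprehensive Meta-Analysis software, and the input counts are hypothetical.

```python
import math


def pool_proportions(events, totals):
    """Random-effects pooling of proportions (logit scale, DerSimonian-Laird).

    Returns (pooled rate, lower 95% limit, upper 95% limit, I2 in percent).
    Assumes 0 < events < totals for every study so that logits are finite.
    """
    y, v = [], []
    for e, n in zip(events, totals):
        y.append(math.log(e / (n - e)))        # logit of the study rate
        v.append(1.0 / e + 1.0 / (n - e))      # approximate variance of the logit
    w = [1.0 / vi for vi in v]                 # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(y) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (vi + tau2) for vi in v]     # random-effects weights
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def inv_logit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se), i2


# Hypothetical counts (compliers, patients started) for three studies
pooled, lower, upper, i2 = pool_proportions([80, 150, 45], [100, 200, 90])
print(f"pooled = {pooled:.3f} ({lower:.3f}-{upper:.3f}), I2 = {i2:.1f}%")
```

Under a random-effects model the weights of large and small studies become more similar as the between-study variance grows, which is consistent with the nearly equal relative weights reported for studies of very different size.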

Results 

Description of studies 

The systematic search yielded 1128 non-duplicated records. Figure 1 
presents the study decision flow chart. S.G. and D.B. agreed inclusion 
on all studies and agreed on reasons for exclusion for 91% of studies 
(see Table II of supplemental material for reasons for exclusion of full 
manuscripts screened). The authors of the 10 papers with missing or 
inconsistent data and of 5 other included papers were contacted to 
obtain missing data from the manuscripts. Four authors replied 
stating that the requested data were not available. 

The 10 included studies sampled 14 810 patients from five coun- 
tries. The population characteristics and design features of the 
studies are shown in Tables I and II. See Supplementary data, 
Tables SIII-V for treatment trajectory data. Critical appraisal of the 
studies is shown in Table III. NOQ ratings indicated no low-quality 
study, three moderate-quality studies (30%) and seven high-quality studies 
(70%) with substantial inter-rater agreement (S.G. and C.M.V.: 
Cohen's κ = 0.750, P = 0.007; S.G. and J.B.: Cohen's κ = 0.872, 
P < 0.001). See Table SIV of supplementary data for details on critical 
appraisal of the studies. 

Meta-analysis 

Compliance rates 

Figure 2 shows the pooled TARC rate for the random-effects model. 
One study, Brandes et al. (2009), did not report data per cycle 
and was not included in the calculation of CAF rates. The 
meta-analysis showed that TARC was 78.2% [95% confidence interval 
(CI) 68.8-85.3%, I² = 99.17], CAF1 was 81.8% (73.3-88.1%, I² = 
98.66) and CAF2 was 75.3% (68.2-81.2%, I² = 95.64). 



Subgroup and meta-regression analyses 

Table IV presents the results of subgroup analysis performed. It was 
not possible to perform subgroup analysis on the basis of the geo- 
graphical location of the clinic, parity or embryo transfer policy, 
whether treatment was subsidized/reimbursed, the definition of 
initiated cycle and handling of cryopreserved embryo transfers, 
because at least one of the subgroups had only one or no study. Vari- 
ability among studies was explained only by how compliance was 
defined. More precisely, those studies that reported on the number 
of doctor-censored patients (and thus considered them to be com- 
pliers) presented higher TARC rates than studies that did not. The dif- 
ferences observed between subgroups related to the study design and 
population, length of follow-up and definition of treatment success 
were not significant. Finally, meta-regressions showed that publication 
year was not significantly related to compliance (TARC: Slope = 0.08, 
Z = 1.716, P = 0.086; CAF1: Slope = 0.06, Z = 1.312, P = 0.190; 
CAF2: Slope = 0.03, Z = 0.813, P = 0.416). 

We excluded the study by Rufat et al. (1994) from the examination 
of associations between per cycle success rates of the clinics and sub- 
sequent compliance because, as already explained, Rufat pooled data 
from several fertility clinics. In addition, another study, Brandes et al. 
(2009), did not report data per cycle and could not be included. 
The clinic's first-cycle success rate was not significantly associated 
with compliance after that first cycle (CAF1: Slope = 0.49, Z = 
0.196, P = 0.845) and the clinic's second-cycle success rate was not 
associated with compliance after that second cycle (CAF2: Slope = 
0.34, Z = 0.137, P = 0.891). 



Study quality and publication bias 

We performed subgroup analysis according to study quality (moderate 
or high) but the results of this analysis were not significant (see 
Table IV). 

Egger's test indicated the presence of publication bias for TARC 
(intercept = 14.21, t = 6.045, P < 0.001), CAF1 (intercept = 10.54, 
t = 5.31, P = 0.001) and CAF2 (intercept = 6.22, t = 4.70, P = 
0.002). Visual inspection of the funnel plots (see Supplementary data, 
Figs. S1-S3) confirmed publication bias only for CAF1, where one 
study was found to the lower right of the pooled compliance rate and 
none to the left (Supplementary data, Fig. S2). The trim and fill 
method identified only one missing study for CAF1, estimating a new 
compliance rate of 80.5% (95% CI 72.0-86.9). 
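
For readers unfamiliar with Egger's test, the sketch below illustrates the classical regression form described by Egger et al. (1997): the standardized effect (estimate divided by its standard error) is regressed on precision (the inverse of the standard error), and an intercept that departs from zero suggests funnel plot asymmetry. The function name and the inputs are hypothetical, the ordinary least-squares fit is written out by hand, and the sketch returns only the intercept and its t statistic (with k - 2 degrees of freedom); it is not the implementation used by the review's software.

```python
import math


def egger_intercept(effects, std_errors):
    """Egger regression: standardized effect on precision.

    Returns (intercept, t statistic of the intercept). The t statistic
    is NaN when the fit is exact and its standard error is zero.
    """
    x = [1.0 / s for s in std_errors]                  # precision
    y = [e / s for e, s in zip(effects, std_errors)]   # standardized effect
    k = len(x)
    if k < 3:
        raise ValueError("at least three studies are needed")
    mean_x, mean_y = sum(x) / k, sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all studies have the same precision")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_var = rss / (k - 2)
    se_intercept = math.sqrt(residual_var * (1.0 / k + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, t_stat


# Hypothetical transformed rates and standard errors for five studies
b0, t = egger_intercept([1.4, 1.1, 0.9, 1.6, 0.7],
                        [0.30, 0.15, 0.10, 0.40, 0.05])
print(f"intercept = {b0:.2f}, t = {t:.2f}")
```

With only 10 studies, as in this review, the test has limited power, which is one reason to read it alongside the funnel plots and the trim and fill results rather than in isolation.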



Table I Sample characteristics reported in the 10 included studies. 

| Study | Country | Sample size | Selected population (if yes, description) | Age of women in years, mean ± SD (range) | Duration of infertility in years, mean ± SD | Parity (none or at least one child) |
|---|---|---|---|---|---|---|
| Brandes et al. (2009, 2011) | The Netherlands | 373 | No | COMP: 31.0 ± 4.1, DISC: 33.3 ± 5.1 | COMP: 1.31 ± 1.0, DISC: 1.9 ± 1.68 | NR |
| De Vries et al. (1998, 1999) | Belgium | 1169 | No | COMP: 31 ± 4.3, DISC: 32 ± 5.5ᵃ | NR | NR |
| Emery et al. (1997) and Slade et al. (1997) | UK | 130 | No | 32.21 ± 3.37 | 8.27 ± 2.97 | None |
| Land et al. (1997) | The Netherlands | 197 | No | NR | NR | NR |
| Pearson et al. (2009) | USA | 2245 | Excluded patients using donor gametes | 35.2 ± 4.3 (20-49) | NR | At least one child |
| Rufat et al. (1994) | France | 8362 | No | 33.1 ± 4.3 | NR | NR |
| Smeenk et al. (2004) | The Netherlands | 380 | No | 34.1 ± 3.9 (21-43) | 3.7 ± 2.2 (1-16) | NR |
| Stolwijk et al. (1996) | The Netherlands | 616 | Excluded patients using donor gametes | NR | NR | NR |
| Verhagen et al. (2008) | The Netherlands | 588 | Excluded patients starting IVF for preimplantation genetic diagnosis, surgical sperm aspiration or using donor gametes | COMP: 32.9 ± 3.6, DISC: 33.8 ± 4.1ᵃ | COMP: 3.0 ± 2.2, DISC: 3.5 ± 2.4ᵃ | NR |
| Witsenburg et al. (2005) | The Netherlands | 750 | No | 33.0 ± 4.0 | NR | NR |

IVF, in vitro fertilization; COMP, group of patients who complied with treatment; DISC, group of patients who discontinued; NR, not reported; USA, United States of America; UK, United Kingdom. 
ᵃ Average age and duration of infertility for total sample not reported. 



Table II Design characteristics of the 10 included studies. 

| Study | Prospective designᵃ (yes/no) | Data collection period | Data available on number of doctor-censored patients (yes/no; if yes, reason for censoring) | Definition of cycle start (started ovarian stimulation, had oocyte retrieval) | Definition of treatment success (positive test, positive scan, live birthᵇ) | Number of embryo transfer policy | Follow-up period (<12 months, >12 monthsᶜ) | IVF cycles exclude cryopreserved embryo transfers (yes/no) | Subsidized/reimbursed treatment (yes/no) |
|---|---|---|---|---|---|---|---|---|---|
| Brandes et al. (2009) | No | 2002-2004 | Yes, 'poor prognosis (doctor's refusal)' | Ovarian stimulation | Positive scan | NR | >12 months | NR | Yes |
| De Vries et al. (1999) | No | 1993-1996 | No | Ovarian stimulation | Positive test | NR | >12 months | NR | NR |
| Emery et al. (1997) | Yes | 1 year | No | Ovarian stimulation | Positive test | NR | >12 months | NR | Yes |
| Land et al. (1997) | No | 1993-1994 | Yes, 'denied further treatment for medical reasons (poor response to hMG or poor fertilization)' | Ovarian stimulation | Positive scan | NR | <12 months | NR | Yes |
| Pearson et al. (2009) | No | 1994-1998 and 1999-2003 | No | Ovarian stimulation | Live birth | NR | NR | Yes | NR |
| Rufat et al. (1994) | No | 1988-1992 | No | Oocyte retrieval | Positive scan | NR | >12 months | NR | NR |
| Smeenk et al. (2004) | Yes | 1999-2000 | Yes, 'active censuring' | Ovarian stimulation | Positive scan | NR | >12 months | Noᵈ | Yes |
| Stolwijk et al. (1996) | No | 1988-1993 | Yes, 'a previous treatment with a fertilization rate of <10%, despite the presence of more than three large follicles (15 mm) on the day of HCG administration and the performance of oocyte aspiration, or three or less large follicles during two previous treatments' | Ovarian stimulation | Positive scan | NR | NR | Noᵉ | NR |
| Verhagen et al. (2008) | No | 2000-2003 | Yes, 'active censuring (poor response, poor fertilization, poor response with poor fertilization, overweight with BMI >30 kg/m², hypertension or improved semen quality not requiring ICSI any more)' | Ovarian stimulation | Positive test | NR | NR | No | Yes |
| Witsenburg et al. (2005) | No | 1996-2000 | No | Ovarian stimulation | Live birth | Maximum of two when age <38, maximum of three when age >3[...] | <12 months | Noᵉ | Yes |

NR, not reported; hMG, human menopausal gonadotrophins; HCG, human chorionic gonadotropin; BMI, body mass index; ICSI, intracytoplasmic sperm injection. 
ᵃ Prospective studies are those where study design and data collection happened before any information on the outcome of interest was collected. 
ᵇ Positive test: positive β-hCG urine/blood test; positive scan: fetal heart activity at 6/7 weeks. 
ᶜ Or adequacy of follow-up period sufficiently justified by authors. 
ᵈ No information was given about how cryopreserved embryo transfer cycles were considered. 
ᵉ Transfers of cryopreserved embryos were considered to be part of the cycle from which the embryos resulted. 
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Table III Quality ratings for the 10 included studies using an adapted Newcastle-Ottawa Quality assessment scale. 



Study 


Quality criterion 








Overall qualit; 
rating (0-7) 


Representative 

population" 

(0-1) 


Ascertainment of 
treatment trajectory b 
(0-3) 


Comparability 0 (0-2) 


Follow-up d 
(0-1) 


Brandes ef al. (2009) 


1 


3 


2 


1 


7 (high) 


De Vries et al. ( 1 999) 


1 


2 


2 


1 


6 (high) 


Emery et al. (1997) 


1 


2 


2 


1 


6 (high) 


Land et al. ( 1 997) 


1 


2 


2 


1 


6 (high) 

V o / 


Pp-irrnn pt nl /"JOOQ'l 
rcdilUM cL Ul. 1 A\J\jy t 


1 


1 


1 


l 


4 (nnodsr3.ts) 


Rufatet ai (1994) 


1 


2 


2 


1 


6 (high) 




1 




2 


0 


A r'hiah'l 

o mign; 


Stolwijk et al. ( 1 996) 


1 


2 


1 


i 


5 (moderate) 


Verhagen et a/. (2008) 


1 


2 


2 


i 


6 (high) 


Witsenburg et al. (2005) 


1 


1 


2 


i 


5 (moderate) 


% of studies that meet criteria 


100% 


20% meet three criteria 


80% meet two criteria 


90% 


70% (high) 






60% meet two criteria 


20% meet one criteria 




30% (moderate) 






20% meet one criteria 






0% (low) 



a The 'representativeness criterion' was met when >80% of eligible patients were invited and >80% agreed to participate, or when the study reported on all consecutive series of patients
over a defined period of time, or when sample size was >300 (1 point).

b The 'ascertainment of treatment trajectory' criterion was met if the study provided enough data to ascertain that withdrawal from treatment was premature (before three cycles
completed and not pregnant and not due to poor prognosis; 1 point), that withdrawal was either permanent (at least a 12-month period since last treatment cycle or permanence sufficiently
justified by authors) or not only from the target clinic (patients did not go to other clinics) (1 point) and that withdrawal was ascertained from secure records (i.e. medical records; 1 point).
c The 'comparability criterion' was met if all participants did treatment during the same period (i.e. data collection period was <5 years) (1 point); and sample was homogeneous regarding
access to treatment (i.e. insurance coverage or number of subsidized cycles was described) or poor prognosis factors (i.e. mean age for all sample <40 or no statistically significant
difference in age between groups) or type of treatment (all patients received the same treatment protocol, or IVF cycles excluded cryopreserved embryo transfers) (1 point).
d The 'follow-up criterion' was met if all cases were accounted for or completion rate (number of patients with outcome at follow-up divided by the number of patients that initiated) was
>80% or description of patients lost to follow-up showed lack of bias (1 point).

The overall quality rating was the sum of met criteria (maximum seven). Quality ratings were grouped into low (0-3), moderate (4-5) and high (6-7) quality studies.
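
The scoring rule in the footnotes can be sketched in a few lines of Python. This is only an illustration: the class and function names are hypothetical, and the per-criterion maxima (1, 3, 2 and 1 points) are read from the footnotes above.

```python
from dataclasses import dataclass


@dataclass
class QualityScores:
    """Points awarded per criterion (names are illustrative, not from the paper)."""
    representativeness: int  # 0-1 point
    ascertainment: int       # 0-3 points
    comparability: int       # 0-2 points
    follow_up: int           # 0-1 point


def overall_quality(s: QualityScores) -> tuple[int, str]:
    """Sum the criteria (maximum seven) and map the total to a quality band."""
    total = s.representativeness + s.ascertainment + s.comparability + s.follow_up
    if total <= 3:
        band = "low"
    elif total <= 5:
        band = "moderate"
    else:
        band = "high"
    return total, band


# Scores for two studies as listed in the table
stolwijk_1996 = overall_quality(QualityScores(1, 2, 1, 1))
verhagen_2008 = overall_quality(QualityScores(1, 2, 2, 1))
print(stolwijk_1996, verhagen_2008)
```

Applied to the rows of the table, the sums agree with the reported ratings for Stolwijk et al. (1996) and Verhagen et al. (2008).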



Study                     Event rate   Lower limit   Upper limit   Relative weight (%), random effects
Brandes et al. 2009       0.898        0.863         0.925          9.797
De Vries et al. 1999      0.681        0.654         0.707         10.210
Emery et al. 1997         0.800        0.722         0.860          9.512
Land et al. 1997          0.695        0.628         0.756          9.881
Pearson et al. 2009       0.656        0.636         0.675         10.243
Rufat et al. 1994         0.494        0.484         0.505         10.269
Smeenk et al. 2004        0.858        0.819         0.890          9.919
Stolwijk et al. 1996      0.792        0.758         0.822         10.110
Verhagen et al. 2008      0.903        0.876         0.924          9.954
Witsenburg et al. 2005    0.845        0.818         0.869         10.105
Pooled event rate         0.782        0.688         0.853



Figure 2 Typical regimen compliance (event rate and 95% CIs) in ART treatment (TARC). 
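
As a rough check on Figure 2, the pooled event rate can be approximated from the per-study event rates and relative weights shown in the figure. The sketch below assumes that pooling was done on the logit scale, which is common for proportions but is an assumption here and not something the figure states; the function names are illustrative.

```python
import math

# (event rate, relative random-effects weight in %) as read from Figure 2
studies = {
    "Brandes 2009": (0.898, 9.797),
    "De Vries 1999": (0.681, 10.210),
    "Emery 1997": (0.800, 9.512),
    "Land 1997": (0.695, 9.881),
    "Pearson 2009": (0.656, 10.243),
    "Rufat 1994": (0.494, 10.269),
    "Smeenk 2004": (0.858, 9.919),
    "Stolwijk 1996": (0.792, 10.110),
    "Verhagen 2008": (0.903, 9.954),
    "Witsenburg 2005": (0.845, 10.105),
}


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def expit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def pooled_rate(data: dict) -> float:
    """Weighted mean of logit-transformed rates, back-transformed to a proportion."""
    total_weight = sum(w for _, w in data.values())
    mean_logit = sum(w * logit(p) for p, w in data.values()) / total_weight
    return expit(mean_logit)


pooled = pooled_rate(studies)
print(f"Pooled TARC estimate: {pooled:.3f}")
```

Under this assumption the result lands close to the pooled rate of 0.782 reported in the figure. Because the random-effects weights are nearly equal, each study contributes about a tenth of the estimate regardless of its sample size.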



Compliance and treatment success rates 

Two studies (Brandes et al., 2009; Pearson et al., 2009) did not report
on the number of pregnancies achieved for the first three ART cycles
and were not included in the calculation of the overall and typical
regimen success rates. The overall success rate for the first three
cycles, which included everyone who started treatment, was 42.7%
(32.6-53.6%, I² = 98.8%). The typical regimen cycle success rate,
which included only compliers (as defined in the TARC formula),
was 57.9% (49.4-65.9%, I² = 97.0%).



Discussion 

This meta-analysis shows that the vast majority of patients will comply 
with the typical ART regimen of three cycles, with about 2 of 10 
patients discontinuing treatment earlier than would have been 
expected. Although many studies have pointed to alarmingly low
compliance rates in ART (Malcolm and Cumming, 2004; Rajkhowa et al.,
2006), doctors can expect that 78% of patients will opt to undergo 
their ART regimen until they achieve pregnancy or are advised to 



Table IV Compliance rates (typical and after the first or second failed cycle) according to subgroup analysis.

TARC = typical ART regimen compliance; CAF1 = compliance after first failed cycle; CAF2 = compliance after second failed cycle; Rate = compliance rate (%); LL, UL = limits of the 95% CI.

                                                      TARC                         CAF1                         CAF2
Variables                                             k  Rate  LL    UL    χ²      k  Rate  LL    UL    χ²      Rate  LL    UL    χ²
Clinical
  Population                                                               0.085                        0.115                     0.351
    General ART                                       7  77.3  64.0  86.7          6  80.9  69.5  88.7          73.7  62.7  82.3
    Selected ART population                           3  80.2  60.2  91.6          3  83.6  68.2  92.4          78.5  64.1  88.2
  Geographic location                                                      NA                           NA                        NA
    Europe                                            9  79.4  67.2  87.9          8  82.6  72.0  89.8          76.6  67.1  84.1
    USA                                               1  65.6  22.6  92.6          1  76.1  36.6  94.6          64.8  33.5  87.1
Patient
  Parity                                                                   NA                           NA                        NA
    0                                                 1  80.0  72.2  86.0          1  91.3  84.1  95.4          77.6  66.9  85.6
    >1 child                                          1  65.6  63.6  67.5          1  76.1  73.9  78.           64.8  61.6  67.9
Methodological
  Prospective design                                                       0.439                        1.617                     0.479
    Yes                                               2  83.2  63.1  93.5          2  89.2  74.3  95.9          79.4  64.5  89.1
    No                                                8  76.8  66.3  84.8          7  79.4  69.3  86.8          74.1  66.3  80.7
  Data available on number of doctor-censored patients                     4.088*                       0.642                     3.341
    Yes                                               5  84.2  75.5  90.2          4  84.6  73.9  91.4          80.4  72.7  86.3
    No                                                5  70.6  58.3  80.5          5  79.2  67.9  87.3          70.7  62.7  77.7
  Length of follow-up                                                      0.007                        0.267                     0.651
    Twelve months or more                             5  77.0  61.6  87.4          4  79.1  65.2  88.4          71.3  60.4  80.1
    >12 months                                        2  78.0  52.7  91.9          2  83.9  66.0  93.3          77.9  63.6  87.6
  Definition of initiated cycle                                            NA                           NA                        NA
    Started hormonal stimulation                      9  80.5  73.7  85.9          8  83.8  78.7  87.9          77.2  70.1  83.0
    Had oocyte retrieval                              1  49.4  23.8  75.4          1  57.7  35.8  76.9          59.2  35.4  79.4
  Definition of treatment success                                          0.141                        0.905                     0.148
    Live birth                                        2  76.3  45.4  92.6          2  83.8  61.7  94.3          75.4  53.8  88.9
    Positive scan at 6/7 weeks                        5  77.1  58.8  88.9          4  77.2  59.6  88.6          73.8  58.4  84.9
    Positive βhCG urine/blood test                    3  81.1  58.4  92.9          3  86.1  69.9  94.3          77.6  60.6  88.7
  IVF cycles exclude cryopreserved embryo transfers                        NA                           NA                        NA
    No                                                4  85.3  80.2  89.2          4  87.4  83.4  90.6          83.2  80.7  85.4
    Yes                                               1  65.6  49.2  79.0          1  76.1  73.9  78.1          64.8  61.6  67.9
  Quality                                                                  0.015                        0.101                     0.239
    High                                              7  78.6  65.9  87.4          6  81.0  70.3  88.4          74.0  63.5  82.3
    Moderate                                          3  77.3  56.4  90.0          3  83.3  69.0  91.8          77.8  64.1  87.3

*P < 0.05, **P < 0.01, ***P < 0.001, k = number of studies, CI = confidence intervals, LL = lower limit, UL = upper limit, NA = not applicable because at least one of the subgroups only has
one study, bold indicates P < 0.05.

end treatment. Compliance is likely to decrease with ART failure, from
82% after the first failed cycle to 75% after the second failed cycle, but
the decrease does not seem to be a function of the efficacy of the
clinic. Compliance rates varied between 71 and 84% as a function of
how compliance was defined (especially inclusion or exclusion of
doctor-censored patients). Results suggest that a less rigorous
definition of compliance may result in it being underestimated. To reach
a definitive estimation of compliance in fertility treatment, researchers
and practitioners need to reach consensus on the definition,
monitoring and reporting of compliance. The chance of achieving a pregnancy
for patients who initiated a typical three-cycle ART regimen was 43%, 
but 58% for those who complied. Patients need to be informed from 
the start of treatment of the possibility of facing a compliance decision 
(i.e. to continue treatment or not) and that chances of treatment 
success are optimal when people comply with recommendations. 
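
The subgroup difference by reporting of doctor-censored patients (TARC 84.2 versus 70.6%) is reported with a test statistic of 4.088 in Table IV and P = 0.043 in the abstract. A minimal sketch of how the two relate, assuming the statistic follows a chi-square distribution with one degree of freedom (two subgroups); the function name is illustrative:

```python
import math


def p_value_chi2_1df(chi2: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 df the statistic is the square of a standard normal variable,
    so P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


p_censoring = p_value_chi2_1df(4.088)   # doctor-censoring subgroups, TARC
p_population = p_value_chi2_1df(0.085)  # general versus selected population, TARC
print(f"{p_censoring:.3f} {p_population:.3f}")
```

Under that assumption the first value agrees with the reported P = 0.043, while the population comparison is far from significance.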

The typical regimen ART compliance rate was 78% in a patient
population that was expected to undergo treatment until they
achieved pregnancy or were recommended to end treatment. It is
reassuring for patients and clinics alike to realize that only about
2 of 10 patients do not comply with recommendations. Studies that
reported on doctor censoring yielded even higher compliance rates. The
reporting of active censoring is critical because it allows calculation
of a compliance rate that takes into account whether the end of
treatment was due to patient initiative or due to doctor recommendation.
Including actively censored patients in the discontinuation group is
misleading because these patients comply with medical recommendation.
However, many studies do not consider this or other conceptual
issues such as differentiation between permanent and temporary
discontinuation or between definitive abandonment of treatment or of
treatment at a given clinic only. This hinders research on compliance
in ART, not only when assessing its prevalence but also when trying
to understand its causes. This meta-analysis showed that when only
the best available evidence is considered, compliance is 84%,
supporting the idea that compliance in ART is indeed high.
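
The effect of doctor censoring on the estimate can be illustrated with a small calculation. The cohort counts below are hypothetical and chosen only for illustration, and the function is a simplified sketch, not the TARC formula used in the meta-analysis.

```python
def compliance_rate(completed_or_pregnant: int,
                    doctor_censored: int,
                    patient_discontinued: int,
                    censored_as_compliers: bool = True) -> float:
    """Share of patients counted as compliant in a cohort.

    Doctor-censored patients stopped on medical advice. Counting them as
    compliers reflects that they followed a recommendation; counting them as
    discontinuers mixes them with patients who stopped on their own initiative.
    """
    total = completed_or_pregnant + doctor_censored + patient_discontinued
    compliers = completed_or_pregnant
    if censored_as_compliers:
        compliers += doctor_censored
    return compliers / total


# Hypothetical cohort of 1000 patients (illustrative numbers only)
with_censoring = compliance_rate(700, 100, 200, censored_as_compliers=True)
without_censoring = compliance_rate(700, 100, 200, censored_as_compliers=False)
print(with_censoring, without_censoring)
```

In this made-up cohort the same data yield 80% compliance when censored patients are treated as compliers and 70% when they are not, the same direction of difference as between studies that did and did not report doctor censoring.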

The finding that compliance decreased with successive experience
of unsuccessful cycles suggests that failure discourages couples from
carrying on with treatment (Akyuz and Sever, 2009), maybe as a
result of a subjective perception of poor prognosis or other factors
such as cost. Although we could not do a subgroup analysis
considering whether treatment was subsidized/reimbursed, the compliance
rate when we considered only studies that clearly stated that
treatment was subsidized/reimbursed was 84%. It may also be that ART
is too demanding, an explanation consistent with patients' own
stated reasons for discontinuation (Smeenk et al., 2004; Verhaak
et al., 2007; Brandes et al., 2009; Boivin et al., 2012; Gameiro et al.,
2012). As such, the compliance rate can also indicate that for 22%
of couples the cost of treatment (financial, emotional) may be too
high. It is relevant to note that the clinic's success rate per cycle
(first and second cycles) was not associated with subsequent
compliance, indicating that the clinic's efficacy does not dictate the
compliance of its patients, despite strong beliefs within clinical
communities that patients leave clinics with lower success rates
(Marcus et al., 2005). It may be that patients disregard clinic success
rates in favour of subjective perceptions of individual chances of
success. It may also be that patients consider other outcomes beyond
efficacy such as quality of care (van Empel et al., 2011) when
considering uptake of further treatment.



Our results show that in every 100 typical couples starting ART
treatment, 78 comply with three cycles and 43 of the 100 can expect
to achieve pregnancy or live birth. However, if full compliance could
be reached, 58 patients would achieve a pregnancy or live birth,
which represents a 15% higher rate of success (if all other factors,
including prognosis, are equal across three ART cycles). Therefore,
addressing causes of non-compliance could help more people
become parents, with a maximum estimated increase in success
rates of 15%. In terms of number of treatment cycles, we would
expect each clinic in Europe to carry out an additional 110 cycles
per year if there was full compliance (based on European data:
402,039 cycles for 2007 in 1029 reporting clinics, excludes frozen
embryo transfers, de Mouzon et al., 2012). Although a more precise
knowledge of why patients discontinue treatment is still lacking,
there are indications that to increase compliance clinics should focus
on organizing treatments so that burden is diminished as much as
possible and ensuring that patients receive support to meet the demands
of treatment (see Gameiro et al., 2012 for reasons for discontinuation
and Boivin et al., 2012 for an integrated model of fertility care). In
addition, more explicit communication about compliance with patients
and between health care providers is needed. Reports have shown
that only 60% of women deciding to stop fertility treatment were
satisfied with their decision (Peddie et al., 2004) and most felt they
lacked the necessary information and counselling support (Peddie et al.,
2005). Explicit information that ART success is likely to require
multiple cycles and that treatment may entail emotional and physical
side effects and disruptions to daily life, for example, would help
address issues previously cited as causes of discontinuation (Rauprich
et al., 2011; Boivin et al., 2012; Gameiro et al., 2012) and help patients
have more realistic expectations of what a typical ART regimen entails
for an optimal chance of pregnancy.
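
The arithmetic behind these figures can be retraced as follows. The success-rate gain is simply the difference between the two pooled rates. For the additional cycles per clinic, the sketch assumes that the cycles observed in 2007 correspond to 78% compliance and scales them up to full compliance; this is one plausible reconstruction of the estimate, not a derivation stated in the text.

```python
# Pooled success rates over the first three cycles (from the Results)
overall_success = 0.427    # all patients who started ART
complier_success = 0.579   # patients who complied with three cycles

# Gain in success rate, in percentage points
gain_points = (complier_success - overall_success) * 100

# European registry figures for 2007 quoted in the text
cycles_2007 = 402_039
reporting_clinics = 1029
compliance = 0.78          # rounded TARC estimate

# Assumption: observed cycles reflect 78% compliance, so full compliance
# would scale the per-clinic workload by 1 / 0.78.
cycles_per_clinic = cycles_2007 / reporting_clinics
extra_cycles_per_clinic = cycles_per_clinic / compliance - cycles_per_clinic

print(round(gain_points), round(cycles_per_clinic), round(extra_cycles_per_clinic))
```

With these inputs the gain rounds to 15 points and the extra workload to 110 cycles per clinic per year, in line with the figures quoted above.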

Strengths and limitations 

Considering the increasing debate surrounding the issue of compliance
in fertility treatment, and in particular in ART, a meta-analysis on this
literature was timely and appropriate. The strengths of this review are
its systematic review of 30 years of research on discontinuation from
seven databases, which yielded 10 studies from five countries,
sampling the treatment trajectories of 14 810 patients. Data were
independently extracted and quality evaluations made according to
standard protocols for all studies. Compliance rates were calculated
according to a clearly defined specification for the typical ART
regimen and after ART failure, which was consistent with ART
practice and guidelines. Analytic methods included the overall
meta-analysis and a priori-defined subgroup analyses according to
relevant clinical, patient and methodological characteristics. Publication
bias analyses, including trim and fill, provided reliable estimates for
'missing' studies. Finally, although high heterogeneity in compliance
rates was observed (above 95%), it was mainly due to statistical artefact
and methodological issues. By statistical artefact, we mean that the
majority of published meta-analyses report on effect sizes (e.g. risk
ratios) from which it is statistically possible to remove the
between-studies variance in base rates for the phenomenon under
investigation (e.g. 1% difference in a base rate of 3 and 4% versus 80
and 81%). However, this is not the case in single-group studies and
therefore meta-analyses of prevalence rates invariably produce high
heterogeneity (Borenstein et al., 2009; e.g. I² of 94% in a recent
meta-analysis of the prevalence of depression in primary care, Mitchell
and Sanjay Rao, 2009). All studies were published in peer-reviewed
journals. They were of moderate to high quality and the quality of
the studies was due to the fact that all used representative samples,
and most studies could demonstrate homogeneity between compliers
and non-compliers at the start of treatment and provided high
completion rates for follow-up. The presence of publication bias for
compliance after the first failed cycle (CAF1) did not markedly influence
the magnitude of the rate reported (estimated to be 1.3% lower).

Despite these strengths, there were some limitations in primary
research that were transmitted to the meta-analysis. In particular, the
research does not provide a full account of patient progression through
ART. Studies report on the proportion of patients who opted to
undergo or stop treatment, but do not fully explain what then
happened to patients registered as ending treatment at a particular
clinic. These patients may have permanently ended treatment, as we
assume, or they may have temporarily stopped or moved to
another clinic. Analysis of the forest plot also revealed that one
study (Rufat et al., 1994) presented a somewhat lower compliance
rate with typical ART regimen than the other studies. This is one of
only two studies (Rufat et al., 1994; Stolwijk et al., 1996) that
cover the pre-ICSI period when many causes of male infertility
could not be addressed with treatment, which could explain the
lower compliance reported. Studies focusing on groups of patients
with poor prognosis or on specific treatments were excluded from
analysis to control for clinical heterogeneity (i.e. use of specialist
treatments, defined clinical subpopulations) so compliance in these
groups is not known. Although these limitations need to be considered
and addressed in the interpretation of the study findings and future
research, the strengths of the systematic review and meta-analytic
procedures adopted support the view that the compliance estimates
reported are reliable and reflect current best available evidence.

Conclusions and future research 

Our results show that ~78% of patients undergo the cycles offered as
part of the typical ART regimen, with uptake lower after ART failures
but still high (82 and 75% after the first and the second failed cycles,
respectively). These estimates are reassuring and should be
communicated to patients, who need to be informed from the start of
treatment that, although ART is demanding, 8 out of every 10 patients
comply with the typical regimen and that compliance with recommended
cycles will offer the best chance of success. Decision
support should be developed to help people choose the best
option (compliance, discontinuation) as ~22% will decide to end
treatment for personal reasons and these patients need to be
helped to reach equipoise about this decision. Future research
should focus on trying to understand why patients discontinue
treatment.

Despite these encouraging results, a definitive estimate of
compliance may still be lacking because primary research does not
provide a full account of patients' progression through the ART cycles.
To progress compliance research, clinicians and researchers need to
reach conceptual and methodological consensus on what compliance is
and how to monitor it. An accurate assessment of compliance
requires reporting the number of patients who undergo the typical
ART regimen. While we studied three cycles, more or fewer ART
cycles could be recommended depending on the patient population
(e.g. poor responders) and ART protocol (e.g. minimal stimulation
ART). In addition, patients who temporarily stop treatment, move
on to another clinic or, as noted, are advised to end treatment
should not be considered as non-compliers. In ART, there is no a
priori time period in which the typical ART regimen should be
completed. Most studies, therefore, set time limits for undergoing
another cycle, typically 12 months, after which patients are considered
to have abandoned treatment. These time limits should be evaluated
for their representativeness of typical cycle uptake and, when used,
reported. There is a voluminous literature on success rates in ART
yet few studies also report the number of patients opting not to
undergo ART, which undermines the research base. Finally, it should
be noted that the literature focuses exclusively on not undergoing
the typical ART regimen (i.e. premature discontinuation). However,
non-compliance can also occur when patients are advised to stop
treatment but resist this idea (Boivin et al., 2005) and choose to
continue ART at other clinics (i.e. over-persistence). This behaviour
should also be monitored to reach an accurate estimation of the
prevalence of 'over-persistence' and to obtain a better understanding
of why couples are not able to follow recommendations to stop
treatment. In summary, a precise assessment of compliance implies
monitoring patients' long-term treatment trajectories. Such an endeavour
requires the inclusion of compliance in national ART registers (e.g.
in the UK the Human Fertilisation and Embryology Authority).
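
To make the monitoring requirements concrete, the distinctions drawn above can be sketched as a classification over a registry record. Everything in this example is hypothetical: the field names, the category labels and the order of the rules are illustrative choices, with the three-cycle regimen and the 12-month limit taken from the text as defaults.

```python
from dataclasses import dataclass


@dataclass
class Trajectory:
    """Hypothetical per-patient registry record (field names are illustrative)."""
    cycles_completed: int
    pregnant: bool
    advised_to_stop: bool           # doctor-censored
    continued_elsewhere: bool       # further ART cycles at another clinic
    months_since_last_cycle: float


def classify(t: Trajectory, regimen_cycles: int = 3,
             time_limit_months: float = 12) -> str:
    """Assign a compliance category following the distinctions in the text."""
    if t.advised_to_stop:
        # Continuing against advice is the 'over-persistence' described above
        return "over-persistence" if t.continued_elsewhere else "complier (doctor-censored)"
    if t.pregnant or t.cycles_completed >= regimen_cycles:
        return "complier"
    if t.continued_elsewhere:
        return "moved clinic"
    if t.months_since_last_cycle < time_limit_months:
        return "temporary stop"
    return "premature discontinuation"


examples = [
    Trajectory(3, False, False, False, 20),
    Trajectory(1, False, True, False, 20),
    Trajectory(2, False, True, True, 3),
    Trajectory(1, False, False, True, 6),
    Trajectory(1, False, False, False, 4),
    Trajectory(1, False, False, False, 18),
]
labels = [classify(t) for t in examples]
print(labels)
```

A register would need to link records across clinics to populate a field such as `continued_elsewhere`, which is the kind of long-term trajectory data the primary studies lack.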

Supplementary data 

Supplementary data are available at http://humupd.oxfordjournals.org/.